Semantic Web Improved with the Weighted IDF Feature
Abstract
Search engines are developing at a very fast rate. Many algorithms have been tried and tested, yet users still do not get precise results. Social networking sites are growing at a tremendous rate, and their growth has given rise to interesting new problems. These sites use semantic data to enhance their results, which offers a new perspective on how to improve the quality of information retrieval. Many text classification techniques are based on the TF-IDF algorithm, and term weighting plays a significant role in classifying a text document. In this paper, we first extend queries to “keyword+tags” instead of keywords only. Second, we develop a new ranking algorithm (the JEKS algorithm) based on semantic tags from user feedback, using CiteULike data. The algorithm enhances the existing semantic web by using the weighted IDF feature of the TF-IDF algorithm. The suggested algorithm provides a better ranking than Google and can be viewed as a semantic web service in the academic domain.
Keywords: Text classification; Semantic Web with weighted IDF feature; Expanded query; New Semantic Web Algorithm; Ranking Algorithm
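The abstract does not spell out the JEKS algorithm, so the following is only a minimal sketch of the general idea it describes: a query expanded from keywords to "keyword+tags", scored with TF-IDF where the IDF of tag terms receives an extra weight. The function names, the `tag_weight` parameter, the smoothed IDF formula, and the toy corpus are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative simplification)."""
    return text.lower().split()


def idf(term, doc_sets):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(doc_sets)
    df = sum(1 for terms in doc_sets if term in terms)
    return math.log((1 + n) / (1 + df)) + 1.0


def score(keywords, tags, tokens, doc_sets, tag_weight=2.0):
    """Score one document against an expanded 'keyword+tags' query.

    Assumption: tag terms have their IDF multiplied by `tag_weight`;
    keyword terms use the plain IDF.
    """
    if not tokens:
        return 0.0
    tf = Counter(tokens)
    total = 0.0
    for term in keywords:
        total += (tf[term] / len(tokens)) * idf(term, doc_sets)
    for term in tags:
        total += (tf[term] / len(tokens)) * tag_weight * idf(term, doc_sets)
    return total


def rank_documents(keywords, tags, corpus, tag_weight=2.0):
    """Return (document index, score) pairs sorted by descending score."""
    docs = [tokenize(text) for text in corpus]
    doc_sets = [set(tokens) for tokens in docs]
    scored = [
        (i, score(keywords, tags, tokens, doc_sets, tag_weight))
        for i, tokens in enumerate(docs)
    ]
    return sorted(scored, key=lambda pair: -pair[1])


if __name__ == "__main__":
    # Hypothetical toy corpus, not CiteULike data.
    corpus = [
        "semantic web search engines",
        "semantic ranking of academic papers",
        "cooking recipes for pasta",
    ]
    for index, value in rank_documents(["semantic"], ["ranking"], corpus):
        print(index, round(value, 4))
```

With `tag_weight` above zero, a document matching both the keyword and the tag moves ahead of one matching the keyword alone; setting `tag_weight=0` reduces the sketch to a keyword-only TF-IDF ranking, which makes the effect of the tag expansion easy to compare.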
Similar Resources
Semantic Web Improved with the Weighted IDF Feature and the Class Information
The development of search engines is taking at a very fast rate. Different algorithms have been tried and tested. Still the results are not precise. Social networking sites are developing at tremendous rate and their growth has given birth to the new interesting problems. The social networking sites use semantic data to enhance the results. This provides us with a new perspective on how to impr...
Measuring Effectiveness of Text-Decorated HTML Tags in Web Document Clustering
Web document analysis, and its associated research, underpins much of what is referred to as web intelligence and the envisaged ‘semantic web’. A key issue in this field is how to encode a web document from the raft of potential document “features” without losing salient information. Current research almost always uses word-based feature vectors such as term frequency of specific words (TF) and...
A Novel Weighted Phrase-Based Similarity for Web Documents Clustering
Phrase has been considered as a more informative feature term for improving the effectiveness of document clustering. In this paper, a weighted phrase-based document similarity is proposed to compute the pairwise similarities of documents based on the Weighted Suffix Tree Document (WSTD) model. The weighted phrase-based document similarity is applied to the Group-average Hierarchical Agglomerat...
User Multi-interest Modeling Based on Semantic Similar Network in Personalized Information Retrieval
People spend far more time searching information over the Internet than using it, because the desired information is often buried within a long list of searched results. Personalized internet access is a feasible solution to solve this search vs. use dilemma, which helps identify the web documents users truly need. A user’s interests are usually represented by a profile. In this research, an im...
WSDL Retrieval for Web Services Based on Hybrid SLVM
Recently, two operable WSDL retrieval approaches, bipartite-graph matching and KbSM, were developed for Web service discovery. But their models and similarity metrics of WSDL ignore some term or semantic feature, and involve formal method problem of representation or difficulty of parameter verification. SLVM approaches depend on statistical term measures to implement XML document representatio...
Publication date: 2015